G2p conversion of names: what can we do (better)?

نویسندگان

  • Henk van den Heuvel
  • Jean-Pierre Martens
  • Nanneke Konings
چکیده

In this contribution it is shown that a good approach for the grapheme-to-phoneme conversion of proper names (e.g. person names, toponyms, etc), is to use a cascade of a general purpose grapheme-to-phoneme (G2P) converter and a special purpose phoneme-to-phoneme (P2P) converter. The G2P produces an initial transcription that is then transformed by the P2P. The latter is automatically trained on reference transcriptions of names belonging to the envisaged name category (e.g. toponyms). The P2P learning process is conceived in such a way that it can take account of high order determinants of pronunciation, such as specific syllables, name prefixes and name suffixes. The proposed methodology was successfully tested on person names and toponyms, but we believe that it will also offer substantial reductions of the cost for building pronunciation lexicons of other name categories.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

G2P Conversion of Proper Names Using Word Origin Information

Motivated by the fact that the pronunciation of a name may be influenced by its language of origin, we present methods to improve pronunciation prediction of proper names using word origin information. We train grapheme-to-phoneme (G2P) models on language-specific data sets and interpolate the outputs. We perform experiments on US surnames, a data set where word origin variation occurs naturall...

متن کامل

Basis Identification for Automatic Creation of Pronunciation Lexicon for Proper Names

Development of a proper names pronunciation lexicon is usually a manual effort which can not be avoided. Grapheme to phoneme (G2P) conversion modules, in literature, are usually rule based and work best for non-proper names in a particular language. Proper names are foreign to a G2P module. We follow an optimization approach to enable automatic construction of proper names pronunciation lexicon...

متن کامل

Do we need bigram alignment models? On the effect of alignment quality on transduction accuracy in G2P

We investigate the need for bigram alignment models and the benefit of supervised alignment techniques in graphemeto-phoneme (G2P) conversion. Moreover, we quantitatively estimate the relationship between alignment quality and overall G2P system performance. We find that, in English, bigram alignment models do perform better than unigram alignment models on the G2P task. Moreover, we find that ...

متن کامل

How do you pronounce your name? Improving G2P with transliterations

Grapheme-to-phoneme conversion (G2P) of names is an important and challenging problem. The correct pronunciation of a name is often reflected in its transliterations, which are expressed within a different phonological inventory. We investigate the problem of using transliterations to correct errors produced by state-of-the-art G2P systems. We present a novel re-ranking approach that incorporat...

متن کامل

Development of a phoneme-to-phoneme (p2p) converter to improve the grapheme-to-phoneme (g2p) conversion of names

It is acknowledged that a good phonemic transcription of proper names is imperative for the success of many modern speech-based services such as directory assistance, car navigation, etc. It is also known that state-of-the-art general-purpose grapheme-to-phoneme (g2p) converters perform rather poorly on many name categories. This paper proposes to use a g2p-p2p tandem comprising a state-ofthe-a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007